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An efficient variational principle for the direct optimization of excited states 
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We present a variational function that targets excited states directly based on their position in 
the energy spectrum, along with a Monte Carlo method for its evaluation and minimization whose 
cost scales polynomially for a wide class of approximate wave functions. Being compatible with 
both real and Fock space and open and periodic boundary conditions, the method has the potential 
to impact many areas of chemistry, physics, and materials science. Initial tests on doubly excited 
states show that using this method, the Hilbert space Jastrow antisymmetric geminal power ansatz 
can deliver order-of-magnitude improvements in accuracy relative to equation of motion coupled 
cluster theory, while a very modest real space multi-Slater Jastrow expansion can achieve accuracies 
within 0.1 eV of the best theoretical benchmarks for the carbon dimer. 


The ground state variational principle is probably the 
most important technique in modern electronic structure 
theory. Through its roles in optimizing Slater determi¬ 
nants in Hartree Fock (HE) [T] and density functional 
theory (DFT) [2], the matrix product state (MPS) in den¬ 
sity matrix renormalization group (DMRG) [2111], trial 
functions in variational Monte Carlo (VMC) |S], and lin¬ 
ear combinations in configuration interaction (Cl) [B] , it 
exists as a critical element in the vast majority of ground 
state electronic structure methods used today. Its success 
rests on the existence of a function 
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whose global minimum is the Hamiltonian’s ground 
eigenstate. This function provides a metric telling us 
which parameterization of an approximate ansatz is clos¬ 
est to the true ground state, thus allowing us to optimize 
the ansatz’s full variational freedom for that state alone 
without regard to the description of any other state. In 
practice, of course, we are constrained in our choice of 
ansatz to those permitting an efficient evaluation of E. 
This constraint notwithstanding, the ground state varia¬ 
tional principle has become an essential part of most elec¬ 
tronic structure methods, even those like coupled cluster 
(CC) theory |7] whose practical application involves non- 
variational methods as well. 


To date, the lack of an efficient analogous function for 
excited states has hindered the development of methods 
that can target such states in the same individual and 
variational way. Instead, existing excited state meth¬ 
ods typically require an ansatz to use its variational free¬ 
dom to satisfy the needs of many eigenstates simultane¬ 
ously, the difficulty of which has limited our predictive 
power over the doubly-excited states in light harvesting 
systems, the spectra of excited state absorption experi¬ 
ments, and the band gaps of transition metal oxides. For 
example, linear response (LR) methods such as time de- 
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pendent HE and DFT |S], Cl singles (CIS) |S], equation 
of motion CC with singles and doubles (EOM-CCSD) |9], 
and LR DMRC [TUI[T2] are limited by the requirement 
that all excited states of interest must be found in the 
ground state’s LR space, which for a nonlinear ansatz is 
typically much less flexible than its full variational space. 
In many other cases, such as state-averaged complete ac¬ 
tive space methods Haul, some VMC approaches m, 
and directly targeted DMRC m, crucial ansatz compo¬ 
nents (often the one particle basis) are required to be the 
same for the ground and all excited states. While these 
methods clearly do not take full advantage of an ansatz’s 
variational freedom, they have been preferred due to the 
lack of an efficiently optimizable function whose mini¬ 
mum is an excited state. 


This report presents a new variational principle con¬ 
sisting of two parts: first, a function D('I') whose global 
minimum is an excited eigenstate, and second, a method 
for evaluating and minimizing D whose cost scales poly¬ 
nomially for a wide class of approximate wave functions. 
We will begin by proving that Ul has the necessary proper¬ 
ties to be the basis of an excited state variation principle, 
after which we will detail our method for minimizing it. 
During this discussion, we will explain which wave func¬ 
tions are compatible with the approach, as well as its 
general applicability in chemistry, physics, and materials 
science. Finally, we will present numerical examples that 
demonstrate the method’s potential. 


We employ the function 


D(vI/) = 
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where the energy shift uj is assumed to be placed in be¬ 
tween distinct eigenvalues of H in order to target the 
eigenstate whose eigenvalue is immediately above uj in 
the ordered eigenvalue spectrum. Assuming real num¬ 
bers for brevity, we proceed to prove that this eigenstate 
is the global minimum of D as follows. First, we write 
an exact ansatz as a linear combination of all eigenstates 
of H, jlke) = and rewrite D in terms of H's 





2 


eigenvalues. 
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Differentiating with respect to the elements of c, we see 
that c is a stationary point (SP) if and only if 

Q = c,{uj - Ei){l - {u - E,)n) V i. (4) 


Recalling that uj is assumed to be distinct from any of iJ’s 
eigenvalues, we see that c cannot be a SP if any two of its 
elements that correspond to distinct Hamiltonian eigen¬ 
values are non-zero, as this would prevent (1 — (w — 
from vanishing for both of them. In other words, Idte) 
cannot be a SP of H unless the nonzero values in c all 
correspond to one (possibly degenerate) eigenvalue of E[. 
As the eigenstates of H are clearly SPs of 0(c), we see 
that Idte) is a SP if and only if it is an eigenstate. We 
thus conclude that the global minimum of 0(c) is the 
eigenstate whose eigenvalue is immediately above w, or a 
linear combination of such eigenstates if this eigenvalue 
is degenerate. As l^tg) can describe any state in Hilbert 
space, this minimum value will be less than or equal to 
that of any approximate ansatz, thus achieving the vari¬ 
ational property we desire. 


While formally interesting, the mere existence of a vari¬ 
ational function for excited states is not useful without 
an efficient way to evaluate and minimize it. Indeed, the 
presence of makes the straightforward evaluation of 
O drastically more expensive than the ground state func¬ 
tion E, which is why studies that have worked implicitly 
with this function in the past [min] have, to the best of 
our knowledge, always approximated this term (see dis¬ 
cussion of harmonic Ritz methods below). Here we avoid 
explicitly squaring E[ by resolving identities via complete 
sums over states, 
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which we may evaluate (up to a controllable statisti¬ 
cal uncertainty that obeys the zero variance principle) 
through Monte Carlo integration as 






= ~ —rAhb—■ 
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where the elements of ^ are sampled from |(m|4')p via a 
Metropolis walk (note that the normalization constants 
for numerator and denominator cancel). Thus any ansatz 
admitting efficient evaluations for Wm will be compatible 
with our approach. This includes the wide class of wave 
functions already used in ground state VMC, such as 
Slater-Jastrow (SJ) [5], multi-Slater-Jastrow (MSJ) [18], 
and the Jastrow antisymmetric geminal power (JAGP) 
[roi - El] as well as the MPS. Moreover, the method is 
applicable to real space (in which case to is a position 


vector) and Fock space (in which case to is an occupa¬ 
tion number vector), in open systems (e.g. chemistry) or 
period ones (e.g. materials’ band gaps), thus enabling es¬ 
sentially all of the tools we have for ground state VMC to 
be employed in the direct optimization of an individual, 
w-targeted excited state. 

To this purpose, we now present a minimization 
method analogous to the linear method (LM) |25l [26] 
used for ground states. Replacing the original ansatz 
with a linear combination of itself and its first deriva¬ 
tives with respect to its parameters u, 

|4/)^^a,|vl/*), (7) 

i 

|4'°) = |4') = dldui\'^) V i>0, (8) 

we may then minimize H with respect to a by solving the 
generalized eigenproblem 

^(vl/*|[(a;-i7) - (9) 
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Assuming we were already near the minimum, in which 
case all a elements except oq will be small, then we may 
use a to update u through a reverse Taylor expansion 
exactly as in the traditional LM (in practice we may also 
shift the eigenproblem as in the LM to ensure this as¬ 
sumption is valid). This entire procedure is then iterated 
in similar fashion to Newton’s method until the mini¬ 
mum of H is reached, thus optimizing both the linear 
and nonlinear parameters of our original ansatz Idt). As 
the matrix elements for the eigenproblem can be evalu¬ 
ated through the same stochastic identity resolution as 
described above, we arrive at a full-fledged and efficient 
method for the evaluation and minimization of H for any 
ansatz that can be efficiently used with the ground state 
LM. The precise polynomial cost scaling will of course de¬ 
pend on the choice of 4', with examples including NgN^ 
for real space SJ and JAGP and NsNf for Hilbert space 
JAGP, where Ng and Ng are the number of samples and 
electrons, respectively, both of which will grow linearly 
with system size. 

Note the similarity of this eigenvalue equation to the 
harmonic Davidson equation that arises in applications 
[l6l nn [27l [28] of the harmonic Ritz principle [591 ISO] for 
targeting interior eigenvalues of a matrix. In fact, some of 
these approaches min] appear to have been minimizing 
an approximation to D with respect to linear parameters, 
in which PH^P was approximated by PEIPHP, where 
P is the projector into the subspace corresponding to the 
linear parameters in question. Except for its controllable 
statistical uncertainty, the present approach makes no 
approximation when evaluating fl and can optimize both 
linear and nonlinear parameters alike. 

Before discussing results, we should consider the formal 
consequences of employing an approximate ansatz. First, 
the excitation energies we report are taken as the differ¬ 
ence in £'(4') between the ground and excited states. One 
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could instead compute energies as i?a = uj — on the 
assumption that each was an exact eigenstate, but we 
find this approximation to be much less accurate than 
computing E{'^) directly. Second, due to the nonzero 
energy variance of an approximate ansatz, the w value 
at which fi’s global minimum switches states tends to lie 
lower in energy than it would for an exact ansatz, a trend 
that all of our results display. Third, as occurs for 
in the ground state, the global minimum of n(4') for an 
approximate ansatz may break symmetry. Fourth, the 
SPs of n will not necessarily be orthogonal to one an¬ 
other, either within those for a single shift uj or between 
different shifts, but they may be orthogonalized after the 
fact by diagonalizing the Schrodinger equation within 
the basis they define. In the present results, doing so 
was only necessary for CH 2 ’s 5th and 6th excited states, 
which our method found as symmetry-broken mixtures 
of one another and for which we report the post-2-state- 
diagonalization energies. While these various compli¬ 
cations are undesirable, one should remember that the 
ground state function E{'^) has been enormously suc¬ 
cessful despite suffering the same deficiencies. Indeed, 
the whole point of the variational principle is that as 'k 
becomes more flexible these issues must abate, and so 
just as in the ground state case, our excited state vari¬ 
ational principle offers a systematically improvable path 
towards resolving its own difficulties. 

As a demonstration in Fock space, we have used the 



Excited State Number 

FIG. 1: Singlet excitations for CH 2 in a STO-3G basis. Lines 
mark oj values. Asterisks mark doubly excited states. 


method to optimize the Hilbert space JAGP ansatz [21] 
for singlet excited states in CH 2 (Figure [^, an Hg ring 
(Figure]^, and C 2 (Figure]^. (Full computational de¬ 
tails for all demonstrations are available in the appendix.) 
In CH 2 , the two doubly excited states are absent in CIS 
due to HF’s limited LR space and are treated poorly by 
EOM-CCSD. While CCSD’s LR space contains doubles, 
it lacks the triples necessary to describe the orbital relax¬ 
ations that should accompany the excitation. Although 
JAGP’s LR space also lacks triples, it agrees much better 
with full Cl (FCI) IS], because the variational minimiza¬ 
tion of H explores regions of parameter space beyond the 
LR regime. The excited states of Hg were more compli¬ 
cated, each having FCI expansions with 12 or more deter¬ 
minants with normalized coefficients above 0.1 as com¬ 
pared to 8 or less for CH 2 . Nonetheless, the same pattern 
emerges: the large errors in EOM-CCSD are reduced by 
an order of magnitude in variationally optimized JAGP. 

C 2 provides further evidence of JAGP’s superiority to 
EOM-CCSD for double excitations while also revealing 
the limits of the ansatz’s flexibility. While JAGP deliv¬ 
ers 0.1 eV accuracy versus FCI for excited states 1, 2, 4, 
and 5, it shows an error almost as large as EOM-CCSD 
for state 3, a complicated excitation involving four dif¬ 
ferent electrons in a mixture of double excitations. Even 
with this difficulty, which arises due to the inherently 
two-electron nature of the ansatz, JAGP proves more ac¬ 
curate than EOM-CCSD for each of the first five singlet 
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FIG. 2: Singlet excitations for Hg in a 6-31G basis. Lines 
mark uj values. Asterisks mark doubly excited states. 
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excitations of C 2 . 

To showcase the method’s systematic improvability 
and compatibility with a real space Monte Carlo walk, 
we have also treated C 2 with a MSJ ansatz consisting 
of short configuration state function (CSF) expansions 
and spline-based 1- and 2-body Jastrow factors (Fig¬ 
ure 1^. For each state, we selected CSFs with coeffi¬ 
cients above a given threshold from a complete active 
space (CAS) wave function, leading to fewer than 10 
(65) CSFs per state for a threshold of 0.1 (0.01). Un¬ 
der variational optimization (with the random walk now 
in real space), the worst-case MSJ excitation energy er¬ 
ror is found to drop from 0.3 to 0.1 eV upon lower¬ 
ing the threshold, as expected for a systematically im¬ 
provable method. As a benchmark we use Davidson- 
corrected multi-reference Cl (MRCI-I-Q) in a triple-zeta 
basis, which for excited state 5 (the state) is within 
0.03 eV of the recent quadruple-zeta DMRG [31] and FCI 
quantum Monte Carlo (FCIQMC) [52] benchmarks. Sig¬ 
nificantly, our MSJ result for this state (2.57eV) is within 
0.1 eV of these benchmarks (2.47eV), despite contain¬ 
ing fewer than 100 variational parameters, compared to 
more than 4,000 in EOM-CCSD, millions in DMRG, and 
2,000 in the FCIQMC trial function. This success, along 
with MSJ’s high accuracy for C 2 ’s other excited states, 
demonstrates the advantage of optimizing an ansatz di¬ 
rectly and variationally for an individual excited state. 

We have presented a Monte Carlo method for the ef¬ 
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FIG. 3: Singlet excitations for C 2 in a 6-31G basis. Asterisks 
mark doubly excited states. 


ficient, variational optimization of a function D whose 
global minimum can be tuned to target individual ex¬ 
cited states and which may be used at polynomial cost 
with a wide range of approximate wave functions. In 
demonstrations on three small molecules with low-lying 
doubly excited states, the method’s ability to explore an 
ansatz’s full variational freedom allows for drastic im¬ 
provements in accuracy compared to linear response the¬ 
ories such as EOM-CCSD, which stands as the current 
state-of-the-art in polynomial-cost methods for excited 
states in chemistry. Eurther, we have shown that for 
the notoriously difficult double excitations of the car¬ 
bon dimer, variational optimization allows a very mod¬ 
est multi-Slater Jastrow expansion to achieve accuracies 
on par with the much more cumbersome DMRG and 
ECIQMG benchmarks. Given the central importance of 
double excitations in light harvesting and excited state 
absorption experiments, the method’s compatibility with 
periodic boundary conditions and thus the solid state, 
its systematically improvable nature, and its direct con¬ 
nection to the most widely used method in ground state 
electronic structure, we look forward to further exploring 
its usefulness in modeling challenging excited states. 

We acknowledge funding from the Office of Science, 
Office of Basic Energy Sciences, the US Department of 
Energy, Gontract No. DE-AG02-05GH11231. 
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FIG. 4: Singlet excitations for G 2 in a cc-pVTZ basis with 
MSJ in real space. Asterisks mark doubly excited states. 
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Appendix A: General Information 

EOM-CCSD and FCI results were computed with 
MOLPRO [33], CIS results with QChem [Mll^, MSJ re¬ 
sults with a modified version of QMCPACK |37j with 
the CAS truncation taken from GAMESS [ 55 ], and JAGP 
results with our own prototype Hilbert space quantum 
Monte Carlo code with one- and two-electron integrals 
imported from Psi3 [ 55 ] ■ In JAGP we worked exclusively 
in the symmetrically orthogonalized one particle 

basis and froze the C Is orbital at the HE level. All sta¬ 
tistical uncertainties were converged to less than O.OleV 
in all cases. 


Appendix B: CH2 

Eor CH 2 we used a minimal STO-3G basis set [3^ and 
shifts in Hartree of w = -38.4, -38.3, -39.198, -38.15, - 
38.110, and -38.1 for excited states 1 to 6, respectively. 
As mentioned in the main text, this resulted in minima 
of H for the last two shifts that corresponded JAGPs that 
were symmetry broken combinations of the 5th and 6th 
excited states. The numbers we report for the excitation 
energies are those after rediagonalizing the Schodinger 
equation in the basis of these two JAGP wave functions, 
which restores the desired symmetry. Finally, in A, the 
CH 2 geometry was 

C -0.0722376285 -0.0574604043 0.0000000000 

H -0.0198102890 1.0990427214 0.0000000000 

H 1.0664179823 -0.2665333714 0.0000000000 


Appendix C: Hg 

For Hg we used the 6-31G basis [H] and shifts in 
Hartree of w = -3.17, -3.15, and -3.15 for excited states 1 
to 3, respectively. The geometry for Hg was chosen as a 
regular hexagon with edge lengths (i.e. bond distances) 
of 1.5 A. 


Appendix D: Fock Space C2 

For G 2 with a Fock space random walk we used the 
6-31G basis [41] and shifts in Hartree of w = -74.85, - 
74.85, -74.65, -74.60, and -74.60 for excited states 1 to 
5, respectively. Note that these shifts were not plotted 
in the main text as they are all below the JAGP ground 
state energy of -75.5915 Hartree and would not conve¬ 
niently fit on the plot. Recall that the finite variance of 


an approximate wave function causes the values of oj at 
which the H-minimum switches states to shift down in 
energy, and indeed in G 2 this effect appears to be large 
enough to push the switching energies below the ground 
state energy. None the less, when H is minimized for 
these shifts and then the energy £’('!') is calculated for 
the resulting JAGP states, the results are those plotted 
in the main text. In future work, we believe it will be pos¬ 
sible to build modified versions of H that share its formal 
properties while also supressing this “early switching”, 
but for now we simply choose shifts based on where the 
H-minimum switches states. The G 2 bond distance was 
1.2425146399 A. 


Appendix E: Real Space C2 

For G 2 with a real space random walk we used the 
cc-pVTZ basis [35], both for the orbitals of the MSJ 
wave function and for the GIS, EOM-GGSD, GASSGF, 
and MRGI-I-Q calculations. The GAS expansion from 
which GSFs were were taken for MSJ was the GAS 
space resulting from an equal weighted state averaged 
(8e,8o) GASSGF calculation including the ground state 
and the first 5 singlet excited states. The GSF orbitals 
were taken as the optimized GASSGF orbitals. The Jas- 
trow factors (one each for electron-nuclear, opposite-spin- 
electron, and same-spin-electron) were one dimensional 
functions of the magnitude of the interpartical distance, 
the natural logarithms of which were parameterized as a 
10-section bspline with a cutoff radius of 5 Bohr for the 
electron nuclear and 10 Bohr for the electron-electron. 

We found that the downshifting of the switching val¬ 
ues of OJ was even more severe in real space, and that a 
significant difference was seen when finding the ground 
state by minimizing £'(4>) versus H(4') (note that no sig¬ 
nificant difference of this type was seen in the Fock space 
cases). For consistency, we minimized H(4>) for all states, 
including the ground state, whose shift oj was chosen to 
lie near the point at which the minimum switched to the 
first excited state. We observed that so long as it was 
near the switching point, the precise choice of oj in the 
ground state optimization had only a very minor effect 
on the predicted excitation energies, whereas optimizing 
i?('I') instead (equivalent to choosing oj = — 00 ) produced 
a substantial ground state bias. The u values used for 
the reported excitation energy calculations were -79.15 
for the ground state and -79.00, -79.00, -78.70, -78.70, 
and -78.68 for excited states 1 to 5, respectively. The G 2 
bond distance was 1.2425146399 A. 
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